Finite-sample and asymptotic analysis of generalization ability with an application to penalized regression
Abstract
In this paper, we study the generalization ability (GA) of extremum estimators, that is, the ability of a model to predict outcomes in new samples from the same population. By adapting classical concentration inequalities, we propose upper bounds on the empirical out-of-sample prediction error of extremum estimators. The bounds are functions of the in-sample error, the severity of heavy tails, the in-sample sample size, and model complexity. The error bounds not only serve to measure GA but also illustrate the trade-off between in-sample and out-of-sample fit, which is connected to the traditional bias-variance trade-off. Moreover, the bounds reveal that the hyperparameter K, the number of folds in K-fold cross-validation, causes a bias-variance trade-off for the cross-validation error, which offers a route to hyperparameter optimization in terms of GA. As a direct application of GA analysis, we implement the new upper bounds in penalized regression estimates for both the n ≥ p and n < p cases. We show that the L2 norm difference between penalized and unpenalized regression estimates can be directly explained by the GA of the regression estimates and the GA of the empirical moment conditions. Finally, we show that all the penalized regression estimates are L2 consistent based on GA analysis.
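The quantities named in the abstract can be illustrated with a minimal sketch. It is not the paper's method or its bounds: it assumes a one-predictor ridge regression without intercept on simulated data, and the helper names (`ridge_slope`, `mse`, `kfold_errors`) and the values of the penalty and of K are hypothetical choices for illustration. It computes the average in-sample and out-of-sample errors under K-fold cross-validation for several K, and the gap between the penalized and unpenalized estimates.

```python
import random

def ridge_slope(xs, ys, lam):
    """One-predictor ridge estimate (no intercept): sum(x*y) / (sum(x^2) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def mse(xs, ys, beta):
    """Mean squared prediction error of the fitted slope on (xs, ys)."""
    return sum((y - beta * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def kfold_errors(xs, ys, lam, k, seed=0):
    """Return (mean in-sample MSE, mean out-of-sample MSE) over k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all indices
    in_errs, out_errs = [], []
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        x_tr, y_tr = [xs[j] for j in train], [ys[j] for j in train]
        x_te, y_te = [xs[j] for j in test], [ys[j] for j in test]
        beta = ridge_slope(x_tr, y_tr, lam)
        in_errs.append(mse(x_tr, y_tr, beta))    # fit on the training folds
        out_errs.append(mse(x_te, y_te, beta))   # prediction error on the held-out fold
    return sum(in_errs) / k, sum(out_errs) / k

# Simulated data (assumption): y = 2x + standard normal noise, n = 200.
rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(200)]
ys = [2.0 * x + rng.gauss(0, 1) for x in xs]

beta_ols = ridge_slope(xs, ys, 0.0)    # unpenalized estimate
beta_pen = ridge_slope(xs, ys, 10.0)   # penalized estimate (lam = 10 is arbitrary)
gap = abs(beta_ols - beta_pen)         # distance between the two estimates

results = {k: kfold_errors(xs, ys, 10.0, k) for k in (2, 5, 10)}
for k, (in_e, out_e) in results.items():
    print(f"K={k}: in-sample MSE={in_e:.3f}, out-of-sample MSE={out_e:.3f}")
print(f"|beta_ols - beta_pen| = {gap:.4f}")
```

Varying `k` changes how much data each fit sees and how many held-out points each error is averaged over, which is the mechanism behind the cross-validation trade-off the abstract describes.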
Similar resources
Use of Two Smoothing Parameters in Penalized Spline Estimator for Bi-variate Predictor Non-parametric Regression Model
Penalized spline criteria involve the function of goodness of fit and penalty, which in the penalty function contains smoothing parameters. It serves to control the smoothness of the curve that works simultaneously with point knots and spline degree. The regression function with two predictors in the non-parametric model will have two different non-parametric regression functions. Therefore, we...
Penalized Quantile Regression Estimation for a Model with Endogenous Individual Effects
Abstract. This paper proposes a penalized quantile regression estimator for panel data that explicitly considers individual heterogeneity associated with the covariates. We provide conditions under which the estimator is asymptotically Gaussian, and the harshness of the penalization can be determined by minimizing asymptotic mean squared error. We investigate finite sample and asymptotic perfor...
Penalized Bregman Divergence Estimation via Coordinate Descent
Variable selection via penalized estimation is appealing for dimension reduction. For penalized linear regression, Efron, et al. (2004) introduced the LARS algorithm. Recently, the coordinate descent (CD) algorithm was developed by Friedman, et al. (2007) for penalized linear regression and penalized logistic regression and was shown to gain computational superiority. This paper explores...
Asymptotic algorithm for computing the sample variance of interval data
The problem of the sample variance computation for epistemic interval-valued data is, in general, NP-hard. Therefore, known efficient algorithms for computing variance require strong restrictions on admissible intervals like the no-subset property or heavy limitations on the number of possible intersections between intervals. A new asymptotic algorithm for computing the upper bound of the samp...
Comparison of Ordinal Response Modeling Methods like Decision Trees, Ordinal Forest and L1 Penalized Continuation Ratio Regression in High Dimensional Data
Background: Response variables in most medical and health-related research have an ordinal nature. Conventional modeling methods assume predictor variables to be independent, and consider a large number of samples (n) compared to the number of covariates (p). Therefore, it is not possible to use conventional models for high dimensional genetic data in which p > n. The present study compared th...
Journal title:
- CoRR
Volume abs/1609.03344
Publication date 2016